Abstract

Background: The mosquito, Anopheles gambiae, is the primary vector of human malaria, a disease responsible for 
millions of deaths each year. To improve strategies for controlling transmission of the causative parasite, 
Plasmodium falciparum, we require a thorough understanding of the developmental mechanisms, physiological 
processes and evolutionary pressures affecting life-history traits in the mosquito. Identifying genes expressed in 
particular tissues or involved in specific biological processes is an essential part of this process. 

Results: In this study, we present transcription profiles for ~82% of annotated Anopheles genes in dissected adult 
male and female tissues. The sensitivity afforded by examining dissected tissues found gene activity in an 
additional 20% of the genome that is undetected when using whole-animal samples. The somatic and 
reproductive tissues we examined each displayed patterns of sexually dimorphic and tissue-specific expression. 
By comparing expression profiles with Drosophila melanogaster we also assessed which genes are well conserved 
within the Diptera versus those that are more recently evolved. 

Conclusions: Our expression atlas and associated publicly available database, the MozAtlas (http://www.tissue-atlas.org), 
provides information on the relative strength and specificity of gene expression in several somatic and 
reproductive tissues, isolated from a single strain grown under uniform conditions. The data will serve as a 
reference for other mosquito researchers by providing a simple method for identifying where genes are expressed 
in the adult. In addition, our resource provides insights into the evolutionary diversity associated 
with gene expression levels among species. 



Background 

For organisms in which large-scale mutagenic studies 
are problematic, gene expression catalogues are an 
important tool for annotating processes on a gene-by- 
gene basis. In the malarial vector Anopheles gambiae, 
studies have focused on differential expression in males 
and females [1,2], on samples collected before and after 
the bloodmeal [2,3] and in dissected tissues such as the 
midgut [2], salivary glands [4,5], ovaries [2,6], head and 
carcass [7,8]. However, since these studies often involve 
different mosquito strains, different experimental plat- 
forms and analysis by different statistical methods, 
comparison among treatments is challenging. Here, we 
provide a comprehensive expression atlas and associated 
publicly available database, the MozAtlas 
(http://www.tissue-atlas.org), cataloguing the relative strength and 
specificity of gene expression in tissues of male and 
female mosquitoes using a single genome-wide platform, 
protocol and analysis. 

We employed transcriptional profiling to analyse RNA 
levels in whole body mosquito samples, eight separate 
somatic tissues (head, salivary gland, midgut, Malpighian 
tubules, thoracic and abdominal carcass) and the repro- 
ductive tissues (testis, accessory gland, ovary) of males 
and females separately. In common with the majority of 
sexually reproducing organisms, Anopheles has specialized 
reproductive traits. Of particular interest is the 
female-specific activity of blood-feeding, which provides 
protein for egg development and is a key determinant in 
Plasmodium transmission. In contrast, male mosquitoes 
feed entirely on sugar, are not adapted for digesting 
blood and do not transmit malaria. Consequently, those 
tissues involved in acquiring, ingesting and digesting 
blood are expected to display substantial sexual 
dimorphism at the level of gene expression. 

In this paper we summarize the functions and 
sequence level divergence of genes with sexually 
dimorphic or tissue enriched expression patterns to 
determine which genes, if any, are rapidly evolving. In 
addition, by comparing Anopheles expression profiles 
with matched tissues in Drosophila melanogaster, we 
assess evolutionary conservation of expression profiles 
within the Diptera and identify genes recently evolved in 
Anopheles with tissue specific patterns of expression. 
Such traits provide ideal candidates for use in popula- 
tion control, where vital or fertility-related genes may be 
targeted by genetic knockout [9]. With the ongoing 
development of insect genetics it has become increas- 
ingly likely that some pest populations, including mos- 
quitoes, may be controlled with genetic modification 
[10-15]. 

Results 

Gene expression coverage 

We have analysed gene expression among Anopheles 
males and females using Affymetrix whole-genome 
microarrays. The microarray platform contains 16,942 
unique Anopheles probes corresponding to 10,622 of 
the annotated protein-coding genes, equating to 82% of 
the genes in the genome. Female tissues were dissected 
at 24 hour intervals for a three day period following the 
blood-meal to provide information on the relative 
strength and specificity of gene expression in adult mos- 
quito tissues throughout oogenesis. Equivalent male tis- 
sues were dissected from siblings in parallel. Array 
quality was first assessed by calculating the Pearson cor- 
relation coefficient between samples. Gene expression 
was highly similar among replicates (R ≥ 0.92), indicating 
that variation in our experiment was low (Additional 
File 1). While indicating biological replicates are 
highly consistent, individual-to-individual variation in 
gene expression will of course be masked by the effect 
of tissue pooling. After quality control, we detected 
expression of 10,031 probes corresponding to 7253 
unique genes across Anopheles tissues. Hierarchical 
clustering with probe intensities indicates good discri- 
mination of tissues, with expression distributed accord- 
ing to tissue and sex (Additional File 2). The fraction of 
expressed genes varied from 51% to 74% among samples 
(Figure 1B). Corresponding Drosophila organs, analyzed 
on a similar array platform using the same 
normalization procedure, showed similar levels of relative 
gene activity. Approximately 20% of Anopheles transcripts 
in dissected samples are absent from whole-body 
estimates, and only a third of transcripts are recorded 
across all tissues (Figure 1B). 

Figure 1 Global expression coverage. (A) The proportion of 
probes giving at least 3 out of 4 mismatch calls in either male or 
female samples for tissue in the MozAtlas and FlyAtlas. (B) Tissue 
breadth. "House-keeping" genes were identified as having a tau- 
statistic under 0.15 (n = 909), and narrowly expressed genes a tau-statistic 
above 0.85 (n = 3446). Overall, only a third of genes were detected 
in all tissues. 
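
The tau-statistic used above to separate "house-keeping" from narrowly expressed genes is not defined in this section. The following is a minimal sketch, assuming it follows a commonly used tissue-specificity index in which each tissue's expression is scaled to the gene's maximum; the function names and the example values are hypothetical, and only the 0.15 and 0.85 cut-offs come from the text.

```python
def tau(expression):
    """Tissue-specificity index (assumed definition).

    expression: non-negative expression values for one gene, one per tissue.
    Returns 0 for perfectly uniform expression and 1 for a single-tissue gene.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues")
    peak = max(expression)
    if peak <= 0:
        raise ValueError("gene is not expressed in any tissue")
    # Sum of each tissue's shortfall relative to the peak, scaled to [0, 1].
    return sum(1 - x / peak for x in expression) / (n - 1)


def breadth_class(t, low=0.15, high=0.85):
    """Apply the cut-offs quoted in the Figure 1 caption."""
    if t < low:
        return "house-keeping"
    if t > high:
        return "narrow"
    return "intermediate"


# Hypothetical intensities for one gene across four tissues.
print(breadth_class(tau([10.0, 1.0, 1.0, 1.0])))
```

Under this assumed definition, a gene detected at the same level everywhere scores 0 and a gene detected in only one tissue scores 1, which is consistent with how the thresholds are used in the text.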

Sexually dimorphic gene expression 

To investigate sexually dimorphic expression, a linear 
model was fit to male and female tissue samples. On the 
basis of differential expression, we identified probes as 
either male-biased or female-biased with a 2-fold change 
of intensity and statistical significance at the Q<0.05 level 
(Additional File 3). Overall, 54% of genes are sexually 
dimorphic in at least one organ, including a substantial 
degree of sex-biased expression in most somatic tissues. 
Of the 3924 sexually dimorphic genes, 72% are detected 
in whole-body male and female samples, with the 
remaining 28% only in dissected tissues (Figure 2A). Each 
tissue displays a moderate degree of sexual dimorphism; 
however, by and large, somatic tissues are closely related 
irrespective of sex when clustered according to expression 
level (Figure 2B). Thus, each tissue exhibits a specific 
gene expression profile that is overlaid with sex-specific 
functions. Sexually dimorphic expression is most skewed 
in the head, with a high number of female-biased genes 
detected; in particular we found an over-representation 
of odorant receptor genes with female-biased expression. 
However, overall, and in all other somatic tissues, there 
were approximately equal numbers of male- and female-biased 
genes. 

Figure 2 Sexually dimorphic expression. (A) The proportion of sexually dimorphic expression in each tissue versus total sexual dimorphism. 
The ratio of female- to male-biased expression is provided (female:male). All ratios deviate significantly from equality (Chi-squared test; P < 0.05). 
(B) Hierarchical clustering of probes among tissues and sex with Euclidean Distance; female (red), male (blue). Branch support was estimated 
with 10,000 bootstrapped replicates. Expression enrichment against carcass for the (C) ovary and (D) testis. Significant gonad enrichment is 
highlighted in dark grey (ANOVA; M > 2; Q < 0.05). Inset figures show the proportion of gonad enrichment which is also sexually dimorphic in 
whole-body samples, i.e. 51% female-biased (dark red); 7% male-biased expression (dark blue). 
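
The sex-bias call described in this section (a 2-fold change in intensity with significance at Q < 0.05) can be illustrated with a short sketch. It assumes the linear model has already produced a log2 female/male ratio and a Q value per probe; the function name, argument names and the choice to treat exactly 2-fold as biased are assumptions, not details taken from the paper.

```python
import math


def classify_sex_bias(log2_female_over_male, q_value, min_fold=2.0, max_q=0.05):
    """Label one probe as female-biased, male-biased or unbiased.

    log2_female_over_male: log2 ratio of female to male intensity (assumed input).
    q_value: multiple-testing adjusted significance for that ratio (assumed input).
    """
    threshold = math.log2(min_fold)  # a 2-fold change is 1 on the log2 scale
    if q_value >= max_q or abs(log2_female_over_male) < threshold:
        return "unbiased"
    return "female-biased" if log2_female_over_male > 0 else "male-biased"


# Hypothetical probes: (log2 ratio, Q value)
for ratio, q in [(1.5, 0.01), (-2.0, 0.001), (3.0, 0.2), (0.5, 0.001)]:
    print(ratio, q, classify_sex_bias(ratio, q))
```

Requiring both criteria means a large fold change with weak statistical support, or a well-supported but small difference, is not counted as sexually dimorphic.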

Sexual dimorphism at the gene expression level is asso- 
ciated with different and distinct functional categories 
(Additional File 4). For example, genes with 'digestion', 
'protein metabolism' and 'proteolytic' functions, especially 
'serine-type endopeptidase', are over-expressed in 
the midgut of females. Genes associated with 'cellular 
homeostasis', 'ligase activity' and 'transporter activity' are 
enriched in the female salivary gland, while the Malpighian 
tubules display an over-representation of genes associated 
with 'ion transportation'. In comparison, male-elevated 
genes are largely associated with 'carbohydrate 
metabolic activity', 'ion transporter activity' and 'iron ion 
binding' within the midgut, salivary gland and Malpighian 
tubules, as well as the carcass. Ultimately, many of 
the genes elevated in either sex are of unknown function. 

Tissue specific gene expression 

In the somatic and reproductive organs examined, a 
subset of genes showed considerable specificity (Figure 
3A). The highest proportion of tissue-specific expression 
occurs in the testis, where approximately 10% of transcripts 
are unique. In comparison, ovary-specific genes 
account for ~4% of expression in the tissue, and several 
of the ovary-specific genes are members of the chorion 
family. We also found a set of 54 accessory-gland 
expressed genes, absent from other tissues, representing 
~2% of the expression in this tissue (Figure 3A). In common 
with the Drosophila Acps, our Anopheles candidates 
are over-represented in the top 10% of intensity values 
recorded for the accessory gland (χ² = 9.45; d.f. = 1; P < 
0.003), and many contain secretory domains necessary 
for transfer to females. Non-reproductive tissues also 
have a substantial number of genes with specific expression 
patterns, the majority in a single sex; these are especially 
prevalent in the midgut, salivary gland and carcass. 

Previous studies indicate that genes with restricted 
expression have elevated rates of sequence divergence 
amongst related species [16]. We conducted a large-scale 
survey of SNP A/S ratios using data from dbSNP 
to determine if such genes were evolving rapidly in Anopheles 
[17]. First, 11,224 genes with at least one coding 
SNP were collected. In total, ~100,000 coding-region 
SNPs and 316,043 intronic SNPs were identified, corresponding 
to SNP densities of 5.6 and 7.19 SNPs, 
respectively, per 1,000 nucleotides. For our entire dataset, 
the number of non-synonymous coding SNPs per 
non-synonymous site (A) was 0.0033, the number of 
synonymous coding SNPs per synonymous site (S) was 
0.0068, and the A/S ratio was 0.49. 

Baker et al. BMC Genomics 2011, 12:296 
http://www.biomedcentral.com/1471-2164/12/296 

[Figure 3 plot residue: (A) bars for salivary glands, Malpighian tubules, midgut, head, carcass, testis, ovary and accessory glands, x-axis "Tissue % Specificity" (0 to 10); (B) x-axis "SNP A/S Ratio".] 

Figure 3 Tissue-specific expression patterns. (A) Tissue specific 
transcription was investigated on the basis of probe detection (3 
out of 4 mismatch calls). Instances where probes were detected in a 
single tissue and a single sex are also highlighted: Female (red), 
male (blue). (B) SNP A/S ratio for genes with tissue-specific 
expression. 95% C.I. was estimated with 10,000 bootstrapped 
replicates. 

SNP A/S estimates of < 1 suggest that most nucleotide 
substitutions have been eliminated by selection, i.e. puri- 
fying selection, whereas SNP A/S > 1 indicate that 
non-synonymous nucleotide substitutions have been 
maintained, i.e. positive selection. As expected, many tissue-specific 
genes display a higher SNP A/S ratio 
than those ubiquitously expressed throughout the 
organism, i.e. fewer non-synonymous mutations have 
been eliminated by selection and these genes are evolving more 
rapidly (Figure 3B). For example, genes expressed in 
reproductive tissues including the testis, ovary and the 
male accessory gland have the highest rates of sequence 
divergence within Anopheles. Non-reproductive tissues 
including the head and Malpighian tubules show less 
deviation, while genes specifically expressed in the sali- 
vary gland and midgut have only marginally higher A/S 
SNP ratios than ubiquitously expressed genes. 
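
As a worked illustration of the A/S ratio discussed above, the sketch below divides the per-site non-synonymous SNP density (A) by the per-site synonymous density (S) and applies the interpretation given in the text. The two densities are the genome-wide values reported in this section; the function names are hypothetical and the sketch is not the authors' pipeline.

```python
def as_ratio(a_per_site, s_per_site):
    """Ratio of non-synonymous (A) to synonymous (S) SNPs per site."""
    if s_per_site <= 0:
        raise ValueError("synonymous SNP density must be positive")
    return a_per_site / s_per_site


def interpret(ratio):
    """Reading of the ratio as described in the text."""
    if ratio < 1:
        return "consistent with purifying selection"
    if ratio > 1:
        return "consistent with positive selection"
    return "no deviation from neutrality"


# Genome-wide densities reported in the text: A = 0.0033, S = 0.0068.
genome_wide = as_ratio(0.0033, 0.0068)
print(round(genome_wide, 2), interpret(genome_wide))
```

Tissue-specific gene sets can be compared in the same way by computing A and S over the SNPs and sites belonging to each set.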

Chromosomal distribution of tissue expression 

Across a range of Metazoan species, genes with elevated 
male expression are non-randomly distributed around 
the genome [18]. However, in Anopheles, previous global 
estimates of sex-biased expression failed to identify this 
property [1]. Anopheles tissue dissections provide sub- 
stantially more information about male-specific gene 
expression than whole-body samples. For example, while 
a comparison of ovary and carcass expression indicates 
over half the ovary-enriched genes are female-biased in 
whole-body samples (Figure 2C), less than 10% of testis- 
enriched genes are male-biased, largely because they are 
undetected in whole-body samples (Figure 2D). Our 
new dataset allowed us to revisit the issue of genome 
position and expression in reproductive and somatic tis- 
sues. We found that genes expressed in the testis, but 
not the ovary, are under-represented on the X chromo- 
some (Figure 4A). In addition, male-biased somatically- 
expressed genes are also under-represented on the X 
chromosome (Figure 4B). We find that SNPs 
in testis-expressed genes show higher A/S ratios on 
the X chromosome than on the autosomes (χ² = 26.5, df 
= 1, P < 2.54 × 10⁻⁷; Figure 4C). Even though this finding 
is consistent with the expectation that X chromosomes 
are hostile to testis-expressed genes, the same 
pattern was not observed with somatic tissues (χ² = 0.13, 
df = 1, P = NS; Figure 4D). 
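
The chi-squared statistics quoted in this section all have one degree of freedom, for which the upper-tail probability has a closed form in terms of the complementary error function. The sketch below uses only the Python standard library to convert a statistic into a P value; it is an illustration of that relationship, not the software used in the study, and the function name is hypothetical.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail probability of a chi-squared statistic with 1 d.f.

    With one degree of freedom the statistic is the square of a standard
    normal deviate, so P(X > x) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("chi-squared statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistics reported in the text for X-linked versus autosomal A/S ratios.
for label, stat in [("testis-expressed", 26.5), ("somatic", 0.13)]:
    print(label, chi2_p_value_df1(stat))
```

The testis comparison gives a very small probability, whereas the somatic comparison is far from conventional significance thresholds, matching the contrast drawn in the text.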

Comparative evolution with Drosophila melanogaster 

To estimate evolutionary divergence in tissue expression 
profiles, orthology relationships in Drosophila and Anopheles 
(Insecta: Diptera) were traced back to a common 
Metazoan ancestor: Tribolium castaneum (Insecta: 
Coleoptera), Apis mellifera (Insecta: Hymenoptera) or 
Caenorhabditis elegans (Nematoda: Rhabditida) (Figure 5A). 
From this analysis, we estimate that over half of the genes 
in the Anopheles and Drosophila genomes are in 1:1 
orthology relationships (n = 6726; Additional File 5); ~95% 
of which can be traced back to one of the outgroups used 
in our analysis with the remaining pairs specific to the 
Dipteran clade. Using these orthologues, we first compared 
mosquito expression with the same tissues in Drosophila 
[19]. Rather than relying on an absolute measure of 
gene expression, relative measures of abundance (RA) 
were calculated for each gene (see Methods). Hierarchical 
clustering of RA across gene pairs showed that global patterns 
of expression in homologous organs were often 
more similar between species than between unrelated tissues 
within a species (Figure 5B). For some organs (i.e. 
ovary, gut, carcass, head), a large proportion of transcriptional 
variation was conserved between Anopheles and 
Drosophila, suggesting that the underlying gene networks 
have similar functional constraints. 

Figure 4 Gene expression for major chromosome arms. (A) 
Germline only expression; testis (blue), ovary (red). (B) Somatic only 
expression; male (blue), female (red). (C) Germline only X vs 

Figure 5 Orthology Relationships. (A) The oldest common ancestor in gene-families to either a Dipteran, Coleopteran, Hymenopteran or 
Metazoan ancestor. (B) Expression divergence of tissues for one-to-one orthology pairs (n = 4234). Euclidean distance was used to calculate 
similarity among tissues within and between species. Branch support was estimated with 10,000 bootstrapped replicates. Drosophila (grey); 
Anopheles (black). (C) Mean expression of Anopheles orthologue clusters and (D) mean expression of Drosophila orthologue clusters. Mean 
relative expression (RA) level for each cluster according to grayscale. (E) The number of overlapping orthologous genes between Anopheles and 
Drosophila expression clusters calculated with a hypergeometric probability distribution after multiple correction. Light grey (P < 0.05); Dark grey 
(P < 0.01). 

To identify conserved expression signatures underlying 
the above patterns, we used hierarchical clustering with 
pair-wise correlation coefficients to identify co-expressed 
genes for each species. We chose clusters with an average 
similarity of greater than 0.8 and more than 50 
genes for further analysis. Overall, 11 clusters meet 
these criteria in Anopheles and Drosophila, representing 
2884 and 2913 genes respectively (Figure 5C-D; Additional 
File 6). Adjusting these thresholds changes the 
number of groups identified; the values used were selected 
to provide a dataset with reasonably sized gene clusters of 
highly similar expression profiles. 

Between species, we evaluated orthologues in each 
cluster and found several groups with significant overlap 
(Figure 5E). Typically, co-expression groups are elevated 
in one or two tissues. For example, a significant number 
of orthologues are expressed in the head of both the 
Anopheles A1 cluster and the Drosophila D7 cluster 
(Figure 5E). Gene Ontology (GO) annotations for these 
genes are enriched for 'phototransduction' and 'signal 
transduction', indicating a close association with normal 
physiological functions within the head (Table 1). We also 
found conserved signatures that correspond to expression 
in the Malpighian tubules, midgut and carcass. A 
notable exception is that Anopheles salivary gland 
expression (A4) shares most enrichment with Drosophila 
orthologues from the male accessory gland (D1, 
D2). Overall, the largest clusters are expressed in reproductive 
tissues (Figure 5C-D). Orthologues with testis 
expression in Anopheles are spread over a number of 
Drosophila clusters. We further note that a large proportion 
of orthologues are expressed in the 
ovary. Typically, clusters with elevated ovary expression 
show significant overlap between Anopheles and Drosophila, 
and as expected, over-represented GO annotations 
involve basic cellular processes (Table 1). 
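
The cluster overlap test summarised in Figure 5E is described as using a hypergeometric probability distribution. A minimal sketch of such a test is given below, assuming the question is whether two clusters drawn from the same pool of one-to-one orthologues share more genes than expected by chance. The function name, the cluster sizes and the overlap in the example are hypothetical, and the multiple-testing correction mentioned in the caption is not included.

```python
from math import comb


def overlap_p_value(overlap, n_orthologues, size_a, size_b):
    """P(at least `overlap` shared genes) under random assignment.

    n_orthologues: size of the shared pool of one-to-one orthologues.
    size_a: genes in the Anopheles cluster; size_b: genes in the Drosophila cluster.
    """
    upper = min(size_a, size_b)
    total = comb(n_orthologues, size_b)
    # Sum the hypergeometric probabilities for every overlap >= the observed one.
    tail = sum(
        comb(size_a, k) * comb(n_orthologues - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical clusters within a pool of 4234 orthologue pairs.
print(overlap_p_value(overlap=40, n_orthologues=4234, size_a=300, size_b=250))
```

Because every pair of clusters is tested, the resulting P values would need adjusting for multiple comparisons before applying thresholds such as P < 0.05 or P < 0.01.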



Table 1 Orthology cluster overlap, tissue expression and 
enriched gene ontology annotations 

Anopheles   Drosophila   Tissue      GO 
A1          D7           Head        phototransduction; signal transduction 
A2          D9           Carcass     metabolic process; cellular respiration 
A3          D8, D9 
A4          D1, D2       SG, AG      protein folding; signal peptide processing 
A5          D10          MT          transmembrane transport; carbohydrate metabolic process 
A6          D11          Midgut 
A7          D10, D11     Midgut/MT 
A9          D3, D4       Testis      microtubule-based process; spermatogenesis 
A10, A11    D6           Ovary       nucleic acid metabolic process; oogenesis, cell cycle; eggshell formation 

SG = salivary gland; AG = accessory gland; MT = Malpighian tubules. 
Gene ontology significance level (P < 0.01). 



Single-copy and multi-copy gene families 

Our phylogenetic analysis indicated that 5932 families 
contain a single Anopheles gene, whereas another 971 
families show evidence of Anopheles expansion. In this 
latter set, duplications with narrow expression patterns 
(i.e. tau-statistic > 0.85) often arose during the Dipteran 
split and are most prevalent in the testis 
(Figure 6A). However, as well as the testis, a high incidence 
of duplication events involves genes with salivary 
gland, midgut or Malpighian tubule restricted expression 
(Figure 6A). Within single-copy families, 143 
groups are narrowly expressed in the same Anopheles 
and Drosophila tissues (Figure 6B). Such expression is 
most prevalent in the head and testis, but while 
these genes might be expected to evolve rapidly, the 
majority date back to Metazoan and Hymenopteran 
clades. 

Online MozAtlas Database 

For researchers interested in comparing their own 
experiments to the MozAtlas, we have constructed an 
online database and web-browser for querying tissue 
expression in Anopheles (http://www.tissue-atlas.org). 
The single gene query displays tables of normalized 
expression for each probe and tissue available. In addition, 
this search displays available orthology relations: a) 
one-to-one Drosophila melanogaster orthologues and 
corresponding relative gene expression estimates, and b) 
a gene tree of all mosquito, fly and outgroup genes within the 
gene family. We also provide BLAST and batch 
searching facilities to output expression values for larger 
lists of genes that may then be used for further downstream 
analysis. 
Figure 6 Gene copies, family origins and tissue expression. (A) Anopheles gene expansions with restricted expression patterns (n = 325; tau- 
statistic = 1). (B) Single-copy gene-families with narrow spatial expression profiles in both Drosophila and Anopheles tissues (n = 143; tau-statistic 
> 0.85). 



Discussion 

To help improve the functional annotation of the Ano- 
pheles gambiae genome we have generated the MozA- 
tlas, a unified catalogue of tissue-specific gene 
expression from a single mosquito strain. In Drosophila 
melanogaster, cataloguing tissue expression patterns has 
been useful, especially for inferring biological functions, 
since the majority of genes encoded in the genome are 
not ubiquitously expressed [19]. As with the fruit fly, 
Anopheles gene expression also exhibits substantial tissue 
specificity, with only a third of detectably expressed 
genes found in all tissues. Thus, the MozAtlas is a useful 
resource for better understanding the mosquito genome, 
providing direct evidence of genes with tissue restricted 
expression. Below we highlight the utility of MozAtlas 
for identifying classes of gene with tissue or sex-biased 
expression that may be exploited for vector control. 
Analysis of the MozAtlas also identifies gene expression 
features that are of interest from an evolutionary per- 
spective, revealing both highly conserved and species- 
specific aspects of insect biology. Of particular interest, 
given that malaria parasites are only transmitted through 
female mosquitoes, we separately catalogued gene 
expression for each tissue in males and females, thus 
providing both tissue and sex-specific views of gene 
expression in the adult. 

A major finding from our analysis is the substantial 
degree of sexually dimorphic gene expression we find at 
the tissue level: more than half of the genes for which 
we detect expression exhibit sexual dimorphism in 
terms of expression level. The head, in particular, has a 
significantly higher number of female-biased genes and 
of these, odorant receptors are significantly over-repre- 
sented (Additional File 4). When searching for a blood- 
meal, female mosquitoes are attracted to odours emitted 
by humans, a behaviour mediated by receptors in the 
antennal sensilla [7]. This activity is not exhibited by 
males, who feed entirely on nectar, and we presume that 
the female-elevated expression of odorant binding molecules 
reflects this biology. The identification of molecules 
associated with female-specific aspects of odorant detec- 
tion may provide targets for controlling malaria trans- 
mission [20]. 

We identified other sexually dimorphic expression signatures 
that appear to be associated with female characteristics, 
in particular, adaptation to hematophagy. For 
example, in the female salivary gland we found an over-representation 
of genes with protein and lipid catabolic 
activity, ion transport and cellular homeostasis functions. 
We suggest that these reflect the fact that, in females, 
the salivary gland produces compounds to disarm host 
hemostatic and immune responses, thus allowing mos- 
quitoes to take a blood-meal. Similarly, many proteins 
found in the midgut are only synthesized by blood-feed- 
ing females [3,21]: numerous digestive and proteolytic 
molecules implicated in blood digestion were identified 
as female elevated in our analysis. 

In contrast, elevated male gene activity is largely asso- 
ciated with carbohydrate metabolism and ion transport 
activity. Since male mosquitoes feed entirely on sugar, 
these results were not surprising. However, somewhat 
more novel is that iron binding molecules are up-regulated 
in males. While in female mosquitoes iron is especially 
important for egg development and is strongly 
influenced by blood-feeding [22], iron metabolism has 
diverse physiological and developmental roles [23]. 
Although females obtain iron from the blood meal, the 
sugar diet of males may necessitate more efficient iron 
uptake and up-regulation of genes that encode iron 
binding functions. 

In both somatic and reproductive tissues, we identified 
genes with considerable specificity. Tightly controlled, 
tissue-specific expression is of interest for understanding 
the basic biology of a species, and is likely to be key in 
the development of next generation insect control 
agents. For example, genes uniquely expressed in parti- 
cular tissues could be targets for inducing sterility or 
providing regulatory elements to drive localised expres- 
sion of transgenes. In this respect, the highest propor- 
tion of Anopheles tissue-specific expression is in the 
testis, with approximately 10% of transcription uniquely 
detected in this tissue. Testis-specific genes 
with important roles in spermatogenesis, sperm 
competition or sperm-egg interactions present a set of 
targets with potential for inducing male sterility. 

After mating, Anopheles females undergo distinct 
behavioural and physiological changes due to the trans- 
fer of both sperm and proteins produced in the male 
accessory glands [24]: proteins secreted by males and 
passed to females in seminal fluid could provide a route 
for altering female fertility. Via specific expression pro- 
filing of accessory glands we have identified a new set of 
potential Anopheles Acp genes that will enable further 
investigation of sexual conflict within the mosquito. Sex- 
ual antagonism between males and females may be 
expected to cause rapid Acp sequence evolution [25]. 
We find that among tissue-specific genes, those 
expressed in the accessory gland have a higher A/S ratio 
than in many tissues, including the testis. Slower evolu- 
tionary rates in the Anopheles testis might be explained, 
in part, by their mating behaviour: in polyandrous 
insects genes involved in spermatogenesis are often 
under strong positive selection as a result of post-copu- 
latory male-male competition [25], whereas these pres- 
sures in the testis are expected to be absent from the 
largely monandrous Anopheles mosquitoes [26]. 

Genes with ovary specific expression provide potential 
targets for inducing female sterility in mosquitoes given 
that they are closely associated with egg formation. 
Chorion components of the fruit fly eggshell, for exam- 
ple, provide the embryo with protection from the physi- 
cal environment, and disrupting their function causes 
female sterility [27]. Recently, proteomic techniques 
have identified Anopheles eggshell constituents, several 
of which we find to be specifically expressed in the 
ovary, making them favourable candidates for use in 
population control [28]. 

In terms of genome structure, we show that genes 
with male-biased expression are non-randomly distribu- 
ted around the Anopheles genome. Two mechanisms 
have been proposed to explain the disparity in chromo- 
somal distribution of male expressed genes. First, during 
spermatogenesis the X chromosome of males becomes 
inactivated: since few testis genes are expressed post- 
meiotically, evidence suggests that chromosomal inacti- 
vation has promoted autosomal duplication events from 
X-linked genes [18,29,30]. There is compelling evidence 
that X-linked inactivation also occurs in nematodes [31] 
and mammals [32]; however, an under-representation of 
male-biased somatically-expressed genes on the X chro- 
mosome indicates that other forces are also at work. 
Second, since males only have one X chromosome, poly- 
morphisms beneficial to one sex may arise that are det- 
rimental to the other sex. Such antagonistic sexual 
selection may eventually lead to sequence changes and 
demasculinization of the X chromosome [33], and con- 
sistent with this expectation, genes on the Anopheles X 
chromosome have less sequence polymorphism than on 
the autosomes. 

Identifying expression divergence within and between 
closely-related species provides important insights into 
the selective pressures underlying gene regulation 
[34,35]. The opportunity to compare divergence between 
Drosophila and Anopheles, separated by some 250 mil- 
lion years of evolution, allows us to explore gene and 
tissue evolution over a considerable time scale. We find 
that expression similarity in one-to-one orthologues of 
the midgut, head, carcass and ovary expressed genes is 
well conserved in the Diptera and, as expected, genes in 
conserved co-expression clusters perform integral phy- 
siological functions. 

In contrast, tissues such as the testis often show considerable 
transcriptional variation between closely 
related species [36,37]. It has been proposed that testis 
gene regulation plays a critical role in the initial forma- 
tion of reproductive isolation [38]. In addition to the 
Anopheles testis, expression in other tissues is also 
highly divergent: for example, expression in the Malpighian 
tubules is largely not conserved between Anopheles 
and Drosophila. As an organ with a key role in 
detoxification and osmoregulation, this divergence may 
reflect fundamental differences in the diet of each insect 
[39]. In addition, salivary gland and male accessory 
gland expression cluster within rather than between spe- 
cies, evidence for a bout of simultaneous evolution since 
the last common ancestor was shared. Indeed, no 
significant co-expression was detected between species, 
indicating that secretory organ functions have diverged 
during the Dipteran split. 

Recent Anopheles gene duplications are often expressed 
in the testis and, in Drosophila, extreme expansions also 
have spermatogenesis-related functions [40]. As well as 
the testis, other tissues display narrow expression profiles 
of recent origin in Anopheles. Certainly, the blood meal 
imposes a range of challenges on the digestive system of 
mosquitoes and, in part, explains a predominance of 
gene duplications with salivary gland, Malpighian tubule 
or midgut expression. Even between members of the 
same mosquito subgenera, salivary proteins can diverge 
rapidly over time [41]: our data suggest that this evolu-
tionary pattern may also be common in Malpighian
tubule proteins and, to a lesser extent, proteins within 
the midgut. However, specifically expressed genes in 
large families do not necessarily highlight unique func- 
tions, since homologues may perform the same or similar 
functions in a larger set of tissues. Gene families with sin- 
gle members are of interest for identifying unique pro- 
cesses, given that closely related homologues are not 
found within the genome. Narrowly expressed single- 
copy families were detected dating back to Metazoan and 
Hymenopteran clades, perhaps accompanying the emer- 
gence of differentiated organs. It will be of considerable 
interest for insect control programs to determine 
whether such proteins perform integral functions in their 
specific tissues, given that as single copies they should 
perform unique roles within the organism. 

Conclusions 

We have generated a tissue- and sex-specific gene
expression atlas for Anopheles gambiae and used it to
explore mosquito biology related to reproduction, feed- 
ing and gene evolution. Given that Anopheles is the 
major vector of one of the world's most debilitating dis- 
eases, our dataset provides an important reference for 
other mosquito researchers wishing to explore potential 
roles for genes of interest. Of particular importance is 
the identification of uniquely expressed genes that may 
serve as tissue-specific drivers in transgenic constructs 
or potential knockout targets in the next generation of 
insect control agents. 

Methods 

RNA collections and microarray platform 

Male and female mosquito siblings were separated at 
pupation and allowed to emerge into separate cages to 
prevent mating. Three-day-old, non-mated females were
blood-fed and female tissues were dissected at 24-hour
intervals for a three-day period following the blood-
meal. Equivalent male tissues were dissected from age-
matched siblings in parallel. Dissections were carried
out in phosphate-buffered saline using dissecting needles
and a 28-gauge needle to cleanly separate connected tis-
sues from each other. 'Midgut' samples were dissected
clear of the foregut, hindgut and Malpighian tubules to
include the anterior midgut and stomach. 'Head' sam- 
ples were produced by severing at the neck and include 
brain, eyes, cuticle and some fat body. 'Ovary' samples 
include both ovaries and the common oviduct. 'Salivary 
gland' samples include the salivary duct, lateral lobes 
and median lobe. Salivary glands were rinsed extensively 
in PBS to remove the majority of fat body associated 
with the glands. 'Carcass' includes the thoracic and 
abdominal carcass and all tissues therein excluding 
those tissues individually described in the MozAtlas. 
Dissected tissues were placed immediately in Trizol to 
minimize the impact of dissection on the transcriptome. 
For each of four biological replicates, tissues were 
pooled from a minimum of 10 mosquitoes dissected at 
each time point. For each tissue and sex, an equal quan- 
tity of total RNA was pooled from three time points 
sampled after the blood-meal to obtain gene expression 
estimates throughout oogenesis (24, 48, 72 hrs). Each 
RNA sample (50 ng) was subsequently amplified by
two-cycle cDNA target labelling to generate biotinylated
cRNA probes for hybridization onto Affymetrix micro-
arrays [42].

Estimates of gene expression 

Oligonucleotide probes and genes were mapped to the
AgamP3 genome assembly. Unless otherwise stated,
datasets were analyzed with the R statistical program- 
ming language using programs maintained as part of the 
Bioconductor suite [43]. In addition to microarray data- 
sets for Anopheles, matching tissues obtained from the 
Drosophila FlyAtlas were re-analyzed with the same nor- 
malization procedure (GEO: GSE1690; GSE7763). Inten- 
sity values between arrays were first standardized within 
tissues for each species separately using the robust 
multi-array analysis package [44,45]. The expression 
presence and absence calls were assessed with the signal 
to noise ratio of the perfect match and mismatch probes 
provided on Affymetrix arrays. Probes were used in 
further analysis only if they were deemed to be present 
in at least three tissue replicates. All estimates of differ- 
ential expression were adjusted for multiple testing 
using the false discovery rate method [46]. Array data
have been submitted to the Gene Expression Omnibus
under GSE21689. 
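
The multiple-testing adjustment described above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure, a widely used false discovery rate method; the text does not confirm that this exact variant was applied. The function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: the FDR method is the Benjamini-Hochberg step-up procedure.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four probes.
raw = [0.01, 0.04, 0.03, 0.005]
q_values = bh_adjust(raw)
# Probes passing a 5% false discovery rate threshold.
significant = [q < 0.05 for q in q_values]
```

Adjusted values are returned in the same order as the input, so they can be compared directly against a threshold such as Q < 0.05.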

Sexual dimorphism and tissue specificity 

Sexual dimorphism was determined with a linear model 
of gene expression fit to male and female samples for 
each tissue as implemented in the LIMMA library [47]. 
On the basis of differential expression, we subsequently 
identified probes as either male-biased or female-biased
where there was a 2-fold or greater change of intensity
in one sex, in addition to statistical significance at the
Q < 0.05 level (Additional File 3). Two measures of tis-
sue specificity were also calculated: probe detection and
tissue expression breadth. Probes were deemed tissue- 
specific if at least 3 out of 4 mismatch calls were found, 
but only in a single tissue and sex. In comparison, tissue 
breadth was measured by normalizing against maximal 
expression to generate the tau-statistic [48]. The result- 
ing tau-statistic falls within the range of 0 to 1, in 
which higher values indicate greater tissue-biased 
expression. Anopheles and Drosophila Gene Ontology 
annotations (Biological Process, Molecular Function, 
Cellular Component) and the enrichment of functions 
were determined using FlyMine with a 1% false discov- 
ery rate for multiple testing correction [49]. 
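
As an illustration of the tissue-breadth measure, the sketch below computes tau in its commonly used form, tau = sum_i (1 - x_i / x_max) / (n - 1), where x_i is the expression in tissue i and n is the number of tissues. It assumes non-negative expression values on a linear scale; the function name `tau` and the example profiles are hypothetical.

```python
def tau(expression):
    """Tissue-specificity index for one gene across n tissues.

    Each value is normalized against the maximal expression; the result is
    0 for uniform expression and 1 for expression in a single tissue.
    Assumption: values are non-negative and on a linear (not log) scale.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    if min(expression) < 0:
        raise ValueError("expression values must be non-negative")
    peak = max(expression)
    if peak == 0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1 - value / peak for value in expression) / (n - 1)


# Hypothetical profiles across four tissues.
broad = tau([5.0, 5.0, 5.0, 5.0])      # uniform expression
specific = tau([0.0, 0.0, 0.0, 8.0])   # single-tissue expression
```

Under these assumptions, uniformly expressed genes score 0 and genes confined to one tissue score 1, matching the range stated above.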

SNP polymorphism 

When sequences are available for multiple individuals in
a species, the ratio of the observed non-synonymous muta-
tion rate (A) to the synonymous mutation rate (S) can
be utilized as an estimate of the selective pressure. To 
estimate sequence polymorphism within Anopheles we 
conducted a large-scale survey of dbSNP [17]. While it 
is not possible to measure selective constraint on indivi- 
dual proteins directly using this approach, it has been 
demonstrated that when a group of genes are measured 
together, estimates of variation are robust and in good 
agreement with A/S for divergence [50]. 
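
The pooling idea can be sketched as follows. This is only an illustration under stated assumptions: SNP counts are normalized per site, and the counts are summed across the gene group before the ratio is taken. The class `GeneSnpCounts`, the function `pooled_a_over_s`, and all numbers are hypothetical and are not taken from the dbSNP survey.

```python
from dataclasses import dataclass


@dataclass
class GeneSnpCounts:
    """Hypothetical per-gene SNP summary (counts and available sites)."""
    nonsyn_snps: int
    nonsyn_sites: float
    syn_snps: int
    syn_sites: float


def pooled_a_over_s(genes):
    """A/S for a group of genes, pooling counts before taking the ratio.

    Assumption: A and S are SNPs per non-synonymous and per synonymous site.
    """
    a_snps = sum(g.nonsyn_snps for g in genes)
    a_sites = sum(g.nonsyn_sites for g in genes)
    s_snps = sum(g.syn_snps for g in genes)
    s_sites = sum(g.syn_sites for g in genes)
    if a_sites <= 0 or s_sites <= 0 or s_snps == 0:
        raise ValueError("group has no usable synonymous or non-synonymous data")
    return (a_snps / a_sites) / (s_snps / s_sites)


# Hypothetical group of two genes.
group = [
    GeneSnpCounts(nonsyn_snps=2, nonsyn_sites=300.0, syn_snps=4, syn_sites=100.0),
    GeneSnpCounts(nonsyn_snps=1, nonsyn_sites=300.0, syn_snps=6, syn_sites=100.0),
]
ratio = pooled_a_over_s(group)
```

Pooling before dividing avoids undefined ratios for individual genes that carry no synonymous SNPs, which is one reason group-level estimates are more stable than per-gene ones.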

Expression divergence 

Since microarray platforms were designed separately for 
Drosophila and Anopheles, probes have different affi- 
nities to their target RNAs, making the normalization of 
orthologue expression between chips difficult. In order to
compare tissue expression profiles between species, each 
gene was represented as a vector of relative expression 
abundance (RA) across the sampled tissues to avoid 
over-estimating divergence based on absolute expression 
intensity. Where genes are represented by multiple 
probes, the maximum intensity value recorded in each 
tissue was used for subsequent analysis. Since the FlyA- 
tlas does not have separate samples for males and 
females, we combined male and female samples in the 
MozAtlas to make comparisons between species. Hier- 
archical clustering of orthologues was performed with 
measures of RA within and between species. For gene-
wise clustering, we used the Pearson correlation coefficient
as the distance measure and defined similarity between 
clusters using average-linkage clustering. Co-regulated 
genes were defined as any group with an average simi- 
larity of greater than 0.8 that also contained more than 
50 genes. Among species clusters, orthologue overlap
was subsequently investigated with a hypergeometric
probability distribution to determine enrichment. 
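
The steps in this section can be sketched in a few lines. The sketch makes several assumptions that the text does not spell out: RA is taken as each tissue's share of a gene's summed intensity, similarity between two profiles is the Pearson correlation itself, and enrichment is the upper tail of the hypergeometric distribution. All function names and numbers are hypothetical.

```python
import math


def collapse_probes(probe_rows):
    """Keep, for each tissue, the maximum intensity over a gene's probes."""
    return [max(column) for column in zip(*probe_rows)]


def relative_abundance(intensities):
    """Scale one gene's tissue intensities so they sum to 1 (assumed RA)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must sum to a positive value")
    return [value / total for value in intensities]


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a flat profile")
    return cov / math.sqrt(var_x * var_y)


def overlap_p_value(population, cluster_a, cluster_b, shared):
    """Hypergeometric P(overlap >= shared) between two orthologue clusters."""
    upper = min(cluster_a, cluster_b)
    favourable = sum(
        math.comb(cluster_a, i) * math.comb(population - cluster_a, cluster_b - i)
        for i in range(shared, upper + 1)
    )
    return favourable / math.comb(population, cluster_b)


# Hypothetical orthologue pair measured in three shared tissues;
# the mosquito gene is represented by two probes.
mosquito = relative_abundance(
    collapse_probes([[120.0, 30.0, 50.0], [100.0, 40.0, 60.0]])
)
fly = relative_abundance([200.0, 90.0, 110.0])
similarity = pearson(mosquito, fly)
```

Here `population` is the number of orthologues considered, `cluster_a` and `cluster_b` are the cluster sizes in each species, and `shared` is the number of orthologues found in both clusters.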

Orthology classification 

DNA and protein sequences were obtained for D. mel-
anogaster and A. gambiae (Ensembl v50) [51], Tribo-
lium castaneum (Version 3; BeetleBase) [52], Apis
mellifera (Version 2; BeeBase) [53] and Caenorhabditis
elegans (wsl60; Ensembl v50) [51]. One-to-one orthol-
ogy relationships were determined using Inparanoid
with default parameters; we selected the longest avail-
able translation for each annotated protein [54]. Best
reciprocal hits between species were grouped together 
into broader gene-families, and the sequences aligned 
with MUSCLE [55]. Tree topologies were subsequently 
reconstructed with dS (synonymous substitution
rate), dN (nonsynonymous substitution rate), nucleo- 
tide and protein distance measures using TreeBest 
[56,57]. From back-translation of protein alignments, 
TreeBest creates a consensus tree by merging the 
results of neighbour joining and maximum likelihood 
(ML) trees. By default, ML trees based on protein 
alignments are built under the WAG model, while ML
trees based on DNA are built under the HKY model,
which models non-uniform base composition and tran- 
sition/transversion rate bias [58]. Orthology relation- 
ships are described as one-to-one, one-to-many and 
many-to-many gene relationships. 